Learning To Binarize Document Images

Authors

  • Chien-Hsing Chou
  • Wen-Hsiung Lin
  • Fu Chang
Abstract

Document images produced by cameras often have varying degrees of brightness. To handle this, we propose a method that divides an image into several regions and decides which binarization action to take on each region, based on rules derived from a learning process. Because more than one action may be acceptable for a given region, the task is a multi-label, multi-class classification problem, which can be solved effectively with support vector machines. Tests on images produced under normal and inadequate illumination show that our method yields better OCR performance than three global binarization methods and four locally adaptive binarization methods.
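
The region-wise idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The block size, the two action names ("otsu" and "all_white"), and the hand-written contrast rule in choose_action are assumptions made for illustration; in the paper the decision rules are learned with support vector machines, which this sketch does not reproduce. Otsu's threshold is used here only as a familiar statistical thresholding technique.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def otsu_threshold(pixels: List[int]) -> int:
    """Return Otsu's threshold: the t maximizing between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def choose_action(pixels: List[int], min_contrast: int = 30) -> str:
    """Hypothetical stand-in for the learned classifier (an assumption).

    A low-contrast region is treated as pure background; any other
    region is thresholded locally.
    """
    if max(pixels) - min(pixels) < min_contrast:
        return "all_white"
    return "otsu"


def binarize_by_regions(
    image: Image,
    block: int = 4,
    choose: Callable[[List[int]], str] = choose_action,
) -> Image:
    """Split the image into block x block regions and binarize each one.

    Output pixels are 1 (background) or 0 (ink).
    """
    h, w = len(image), len(image[0])
    out = [[1] * w for _ in range(h)]  # default action: all white
    for r0 in range(0, h, block):
        for c0 in range(0, w, block):
            rows = range(r0, min(r0 + block, h))
            cols = range(c0, min(c0 + block, w))
            pixels = [image[r][c] for r in rows for c in cols]
            if choose(pixels) == "otsu":
                t = otsu_threshold(pixels)
                for r in rows:
                    for c in cols:
                        out[r][c] = 1 if image[r][c] > t else 0
    return out
```

Because each region receives its own threshold, a dark stroke in a dimly lit region can still be separated from its background even when that background is darker than the ink threshold appropriate for a brightly lit region. Swapping choose_action for a trained classifier would only require passing a different callable as choose.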

Similar resources

A Binarization Method with Learning-Built Decision Rules for Document Images Produced by Cameras

In this paper, we propose a novel binarization method for document images produced by cameras. Such images often have varying degrees of brightness and require more careful treatment than merely applying a statistical method to obtain a threshold value. To resolve the problem, our method divides an image into several regions and decides how to binarize each region. The decision rules are derive...

A binarization method with learning-built rules for document images produced by cameras

In this paper, we propose a novel binarization method for document images produced by cameras. Such images often have varying degrees of brightness and require more careful treatment than merely applying a statistical method to obtain a threshold value. To resolve the problem, the proposed method divides an image into several regions and decides how to binarize each region. The decision rules a...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

Binarization of camera-captured document using A MAP approach

Document binarization is one of the initial and critical steps for many document analysis systems. Nowadays, with the success and popularity of hand-held devices, large efforts are motivated to convert documents into digital format by using hand-held cameras. In this paper, we propose a Bayesian based maximum a posteriori (MAP) estimation algorithm to binarize the camera-captured document image...

A Two-stage and Parameter-free Binarization Method for Degraded Document Images

Binarization plays an important role in document image processing, especially in degraded documents. For degraded document images, adaptive binarization methods often incorporate local information to determine the binarization threshold for each individual pixel in the document image. We propose a two-stage parameter-free window-based method to binarize the degraded document images. In the firs...

Publication date: 2009